Scale-free memory model for multiagent reinforcement learning. Mean field approximation and rock-paper-scissors dynamics

Abstract

A continuous-time model for multiagent systems governed by reinforcement learning with scale-free memory is developed. The agents are assumed to act independently of one another in optimizing their choice of possible actions via trial-and-error search. To gain awareness of the action value, the agents accumulate in their memory the rewards obtained from taking a specific action at each moment of time. The contribution of past rewards to the agent's current perception of action value is described by an integral operator with a power-law kernel. Finally, a fractional differential equation governing the system dynamics is obtained. The agents are considered to interact with one another implicitly, via the reward of one agent depending on the choice of the other agents. The pairwise interaction model is adopted to describe this effect. As a specific example of systems with non-transitive interactions, two-agent and three-agent systems of the rock-paper-scissors type are analyzed in detail, including stability analysis and numerical simulation. Scale-free memory is demonstrated to cause complex dynamics of the systems at hand. In particular, it is shown that there can be simultaneously two modes of the system instability undergoing subcritical and supercritical bifurcation, with the latter exhibiting anomalous oscillations whose amplitude and period grow with time. Besides, the instability onset via this supercritical mode may be regarded as "altruism self-organization". For the three-agent system the instability dynamics is found to be rather irregular and can be composed of alternating fragments of oscillations that differ in their properties.
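The idea of weighting past rewards with a power-law kernel can be illustrated with a small discrete-time sketch. This is not the authors' model: the abstract gives no equations, so the regularised kernel `(lag + dt)**(-alpha)`, its normalisation, the softmax choice rule, the payoff values, and every name below (`power_law_weights`, `perceived_values`, `simulate`, `alpha`, `beta`) are assumptions made only for illustration. Each of two agents scores its three actions by a power-law weighted sum of the expected rewards it would have received against the other agent's past choice probabilities.

```python
import numpy as np

# Assumed rock-paper-scissors payoff: PAYOFF[a, b] is the reward for playing
# action a against action b (0 = rock, 1 = paper, 2 = scissors).
PAYOFF = np.array([
    [0.0, -1.0, 1.0],
    [1.0, 0.0, -1.0],
    [-1.0, 1.0, 0.0],
])


def power_law_weights(n_steps, alpha, dt=1.0):
    """Normalised weights for rewards received 0, 1, ..., n_steps-1 steps ago.

    Assumed kernel: w[k] proportional to (k*dt + dt)**(-alpha), so w[0]
    belongs to the most recent reward and the weights decay as a power law.
    """
    lags = np.arange(n_steps) * dt + dt
    w = lags ** (-alpha)
    return w / w.sum()


def perceived_values(reward_history, alpha):
    """Power-law weighted average of a (T, n_actions) history, oldest first."""
    w = power_law_weights(reward_history.shape[0], alpha)
    return w[::-1] @ reward_history  # reverse so the newest row gets w[0]


def softmax(q, beta=1.0):
    """Assumed choice rule: probabilities proportional to exp(beta * q)."""
    z = np.exp(beta * (q - q.max()))
    return z / z.sum()


def simulate(n_steps=200, alpha=0.5, beta=2.0, seed=0):
    """Two agents; returns choice probabilities with shape (n_steps, 2, 3)."""
    rng = np.random.default_rng(seed)
    # Small random initial "rewards" break the symmetry between actions.
    hist = [rng.normal(scale=0.01, size=(1, 3)) for _ in range(2)]
    probs = []
    for _ in range(n_steps):
        p = [softmax(perceived_values(h, alpha), beta) for h in hist]
        # Expected reward of each action against the other agent's current mix.
        hist[0] = np.vstack([hist[0], PAYOFF @ p[1]])
        hist[1] = np.vstack([hist[1], PAYOFF @ p[0]])
        probs.append(p)
    return np.array(probs)
```

Calling `simulate(n_steps=500, alpha=0.3)` and plotting one agent's three probabilities over time is one way to explore how the assumed memory exponent changes the trajectories; nothing in this sketch should be read as reproducing the bifurcation results reported in the abstract.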

Bibliographic details

Authors: Lubashevsky, Ihor; Kanemoto, Shigeru
Year: 2010
Original format: PDF
Language: English

Similar literature

1. Scale-free memory model for multiagent reinforcement learning. Mean field approximation and rock-paper-scissors dynamics [J]. Lubashevsky I., Kanemoto S. The European Physical Journal B: Condensed Matter Physics. 2010, issue 1
2. Influence of periodic external fields in multiagent models with language dynamics [J]. Filippo Palombi, Stefano Ferriani, Simona Toti. Physical Review E. 2017, issue 5a6CD
3. Influence of periodic external fields in multiagent models with language dynamics [J]. Filippo Palombi, Stefano Ferriani, Simona Toti. Physical Review E. 2017, issue 6aPta1
4. Well-posedness and finite dimensional approximations of a mathematical model for the dynamics of shape-memory alloys [C]. Ruben D. Spies, Univ. of Minnesota/Twin Cities, Lauderdale. Smart Structures and Materials 1993: Mathematics in Smart Structures. 1993
5. Scaling multiagent reinforcement learning [D]. Proper, Scott. 2010
6. Learning to reach by reinforcement learning using a receptive field based function approximation approach with continuous actions [O]. Minija Tamosiunaite, Tamim Asfour, Florentin Wörgötter
7. Multiagent cooperation and competition with deep reinforcement learning [O]. Ardi Tampuu, Tambet Matiisen, Dorian Kodelja, 2017
